Add rate limit HTTP service for external rate limits #1460

Draft: minrk wants to merge 1 commit into main

minrk
Member

@minrk minrk commented Mar 8, 2022

Related to jupyterhub/mybinder.org-deploy#2143

  • Moves repo quotas from concurrent launches to a time-based rate limit
  • Allows running rate limiters in a separate HTTP endpoint, so they can be shared across federation members

The idea:

  1. run one instance of the rate limiter somewhere (e.g. in the prime federation member) and expose it as a service (add it to the binderhub chart, using the same image as binderhub itself)
  2. issue token(s) for access to the rate limiter
  3. set rate_limit_url and rate_limit_token for all federation members

The rate limit implementation is the same (RateLimiter objects), but wrapped in an HTTP request when using the shared implementation.
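
To make the shape of this concrete, here is a rough stdlib-only sketch, not the code in this PR. The `RateLimiter` class below is a hypothetical fixed-window stand-in rather than BinderHub's actual class, and the URL layout (`POST /<key>`), the `Authorization: token ...` header, the JSON response body, and the helper names `make_server` and `check_remote` are all assumptions for illustration. Only `rate_limit_url` and `rate_limit_token` come from the description above.

```python
import json
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RateLimiter:
    """Hypothetical fixed-window limiter: at most `limit` hits per key per period."""

    def __init__(self, limit=2, period_seconds=3600, clock=time.monotonic):
        self.limit = limit
        self.period_seconds = period_seconds
        self.clock = clock
        self._windows = {}  # key -> (window_start, count)

    def increment(self, key):
        """Count one launch for `key`; return True if it is within the limit."""
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.period_seconds:
            start, count = now, 0  # window expired, start a new one
        count += 1
        self._windows[key] = (start, count)
        return count <= self.limit


def make_server(limiter, token):
    """Wrap a limiter in a tiny HTTP service (assumed API: POST /<key>)."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.headers.get("Authorization") != f"token {token}":
                self.send_response(403)
                self.end_headers()
                return
            allowed = limiter.increment(self.path.lstrip("/"))
            body = json.dumps({"allowed": allowed}).encode()
            self.send_response(200 if allowed else 429)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the example quiet

    return HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port


def check_remote(rate_limit_url, rate_limit_token, key):
    """What a federation member would do instead of calling a local limiter."""
    req = urllib.request.Request(
        f"{rate_limit_url}/{key}",
        method="POST",
        headers={"Authorization": f"token {rate_limit_token}"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["allowed"]
    except urllib.error.HTTPError as e:
        if e.code == 429:  # over the limit
            return False
        raise


if __name__ == "__main__":
    server = make_server(RateLimiter(limit=2), token="secret")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}"
    # Every member shares the same counter because they share the endpoint.
    print([check_remote(url, "secret", "repo-a") for _ in range(3)])
    server.shutdown()
    server.server_close()
```

The point of the sketch is that the limiter logic is unchanged. Only the call site differs, and all counters live in one process.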

Notably, this weighs sessions that only last 10 seconds equally with sessions that last 3 hours. The number of currently active sessions maps onto 'cost' more directly, so maybe it is better to keep the concurrent session quota.
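
A small illustration of that tradeoff, with made-up numbers. The `Session` type and both helper functions are hypothetical and exist only to show how the two quota styles count the same launches differently.

```python
from dataclasses import dataclass


@dataclass
class Session:
    start: float     # launch time, in seconds
    duration: float  # how long the session stayed alive, in seconds


def launches_in_window(sessions, now, period_seconds):
    """Time-based view: launches started within the last `period_seconds`."""
    return sum(1 for s in sessions if now - period_seconds < s.start <= now)


def concurrent_sessions(sessions, now):
    """Concurrent view: sessions that are still running at `now`."""
    return sum(1 for s in sessions if s.start <= now < s.start + s.duration)


sessions = [
    Session(start=0, duration=10),        # a 10-second session
    Session(start=0, duration=3 * 3600),  # a 3-hour session
]

now = 600  # ten minutes after both launches
print(launches_in_window(sessions, now, period_seconds=3600))  # 2
print(concurrent_sessions(sessions, now))                      # 1
```

Ten minutes in, the time-based limit has charged the repo for two launches, while the concurrent quota only sees the one session that is still consuming resources.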

Still lots to do (tests and docs and such), but a sketch worth talking about

Commit: Moves repo quotas from concurrent launches to a time-based rate limit
@minrk
Member Author

minrk commented Mar 9, 2022

This also solves the same problem for a single binderhub with multiple replicas, where rate limits are currently per-replica instead of per-deployment.

An alternative to the custom HTTP endpoint is to use an external store like redis/etcd. It would add a dependency, but might be the better, more general approach.

@manics
Member

manics commented Mar 9, 2022

I feel like this gets into the bigger question of how to manage BinderHub's long-term maintainability and scalability. We've done a bit of work on decoupling some components (e.g. running in K8s vs Docker)... do we split out more components and deal with their orchestration with Helm/terraform/etc, or build things into BinderHub since it's easier to deploy?

@minrk
Member Author

minrk commented Mar 9, 2022

I'm increasingly leaning toward directly supporting redis as the storage for rate limits. It would be a lot less code to maintain, and deploying a simple redis instance is not a big deal. I'd do that at the mybinder.org-deploy level, rather than binderhub, though. binderhub would only need to take redis connection info.
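
For a sense of how little code the redis route might need, here is a hedged sketch of a fixed-window counter built on two operations, `incr` and `expire`, which mirror the Redis `INCR` and `EXPIRE` commands. `InMemoryStore` and `allow_launch` are hypothetical names, and the in-memory store is a stand-in so the example runs without a redis server. This is an assumption about how it could look, not a proposed implementation.

```python
import time


class InMemoryStore:
    """Stand-in exposing the two redis-style operations the limiter needs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._values = {}
        self._expires = {}

    def _purge(self, key):
        # Drop the key if its time-to-live has passed.
        deadline = self._expires.get(key)
        if deadline is not None and self._clock() >= deadline:
            self._values.pop(key, None)
            self._expires.pop(key, None)

    def incr(self, key):
        """Increment and return the counter, creating it at 1 if missing."""
        self._purge(key)
        self._values[key] = self._values.get(key, 0) + 1
        return self._values[key]

    def expire(self, key, seconds):
        """Set a time-to-live on an existing key."""
        if key in self._values:
            self._expires[key] = self._clock() + seconds


def allow_launch(store, key, limit, period_seconds):
    """Fixed-window check: True if this launch is within `limit` per period."""
    count = store.incr(key)
    if count == 1:
        # First hit in this window starts the clock. With a real redis, INCR
        # and EXPIRE are separate commands, so a production version would
        # want to make this pair atomic (e.g. a transaction or script).
        store.expire(key, period_seconds)
    return count <= limit


if __name__ == "__main__":
    store = InMemoryStore()
    for attempt in range(3):
        print(attempt, allow_launch(store, "repo-a", limit=2, period_seconds=3600))
```

Because the state lives in the store rather than in the binderhub process, every replica or federation member pointed at the same store would see the same counts.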
